{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "5.7. 使用重复元素的网络(VGG).ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/chongzicbo/Dive-into-Deep-Learning-tf.keras/blob/master/5.7.%20%E4%BD%BF%E7%94%A8%E9%87%8D%E5%A4%8D%E5%85%83%E7%B4%A0%E7%9A%84%E7%BD%91%E7%BB%9C(VGG).ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "kwPNvFfW5rSR",
        "colab_type": "text"
      },
      "source": [
        "##5.7. 使用重复元素的网络（VGG）\n",
        "AlexNet在LeNet的基础上增加了3个卷积层。但AlexNet作者对它们的卷积窗口、输出通道数和构造顺序均做了大量的调整。虽然AlexNet指明了深度卷积神经网络可以取得出色的结果，但并没有提供简单的规则以指导后来的研究者如何设计新的网络。我们将在本章的后续几节里介绍几种不同的深度网络设计思路。\n",
        "\n",
        "本节介绍VGG，它的名字来源于论文作者所在的实验室Visual Geometry Group [1]。VGG提出了可以通过重复使用简单的基础块来构建深度模型的思路。\n",
        "\n",
        "###5.7.1. VGG块\n",
        "VGG块的组成规律是：连续使用数个相同的填充为1、窗口形状为 $3\\times3 $的卷积层后接上一个步幅为2、窗口形状为 $2 \\times 2$ 的最大池化层。卷积层保持输入的高和宽不变，而池化层则对其减半。我们使用vgg_block函数来实现这个基础的VGG块，它可以指定卷积层的数量num_convs和输出通道数num_channels。"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "dWmpISxfSyk3",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import tensorflow as tf\n",
        "from tensorflow import keras\n",
        "from tensorflow.keras import layers,Sequential\n",
        "from tensorflow.data import Dataset\n",
        "from tensorflow.keras import losses,optimizers\n",
        "import time\n",
        "import numpy as np\n",
        "from tensorflow import image\n",
        "import os\n",
        "import sys"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "SJhNrnYhTES5",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "tf.enable_eager_execution()"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "FESkpKS5TJzT",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "def vgg_block(num_convs,num_channels):\n",
        "  blk=Sequential()\n",
        "  for _ in range(num_convs):\n",
        "    blk.add(layers.Convolution2D(filters=num_channels,kernel_size=3,padding='same',activation='relu'))\n",
        "\n",
        "  blk.add(layers.MaxPool2D(pool_size=2,strides=2))\n",
        "  return blk"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "baQaFrxc6D9Z",
        "colab_type": "text"
      },
      "source": [
        "###5.7.2. VGG网络\n",
        "与AlexNet和LeNet一样，VGG网络由卷积层模块后接全连接层模块构成。卷积层模块串联数个vgg_block，其超参数由变量conv_arch定义。该变量指定了每个VGG块里卷积层个数和输出通道数。全连接模块则跟AlexNet中的一样。\n",
        "\n",
        "现在我们构造一个VGG网络。它有5个卷积块，前2块使用单卷积层，而后3块使用双卷积层。第一块的输出通道是64，之后每次对输出通道数翻倍，直到变为512。因为这个网络使用了8个卷积层和3个全连接层，所以经常被称为VGG-11。"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Uqzo2gPbTac5",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "conv_arch=((1,64),(1,128),(2,256),(2,512),(2,512))"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "McHwXuX36NRw",
        "colab_type": "text"
      },
      "source": [
        "下面我们实现VGG-11。"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "jYh40xW-Tbzo",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "def vgg(conv_arch):\n",
        "  net=Sequential()\n",
        "  for (num_convs,num_channels) in conv_arch:\n",
        "    net.add(vgg_block(num_convs,num_channels))\n",
        "  net.add(layers.Flatten())\n",
        "  net.add(layers.Dense(4096,activation='relu'))\n",
        "  net.add(layers.Dropout(0.5))\n",
        "  net.add(layers.Dense(4096,activation='relu'))\n",
        "  net.add(layers.Dropout(0.5))\n",
        "  net.add(layers.Dense(10))\n",
        "  return net"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "p3czr5Sv6Rbs",
        "colab_type": "text"
      },
      "source": [
        "下面构造一个高和宽均为224的单通道数据样本来观察每一层的输出形状。"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "J5zjntZwTjZ6",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "net=vgg(conv_arch)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "_GRpFdCpT42N",
        "colab_type": "code",
        "outputId": "a218cc3c-3752-4e92-f785-34a89ffd020e",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 202
        }
      },
      "source": [
        "X=tf.random.uniform(shape=(1,224,224,1))\n",
        "for blk in net.layers:\n",
        "  X=blk(X)\n",
        "  print(blk.name,'output shape:\\t',X.shape)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "sequential_35 output shape:\t (1, 112, 112, 64)\n",
            "sequential_36 output shape:\t (1, 56, 56, 128)\n",
            "sequential_37 output shape:\t (1, 28, 28, 256)\n",
            "sequential_38 output shape:\t (1, 14, 14, 512)\n",
            "sequential_39 output shape:\t (1, 7, 7, 512)\n",
            "flatten output shape:\t (1, 25088)\n",
            "dense_18 output shape:\t (1, 4096)\n",
            "dropout_10 output shape:\t (1, 4096)\n",
            "dense_19 output shape:\t (1, 4096)\n",
            "dropout_11 output shape:\t (1, 4096)\n",
            "dense_20 output shape:\t (1, 10)\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "mavXVURx6Vy5",
        "colab_type": "text"
      },
      "source": [
        "可以看到，每次我们将输入的高和宽减半，直到最终高和宽变成7后传入全连接层。与此同时，输出通道数每次翻倍，直到变成512。因为每个卷积层的窗口大小一样，所以每层的模型参数尺寸和计算复杂度与输入高、输入宽、输入通道数和输出通道数的乘积成正比。VGG这种高和宽减半以及通道翻倍的设计使得多数卷积层都有相同的模型参数尺寸和计算复杂度。"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "9k4Dcc8I6d32",
        "colab_type": "text"
      },
      "source": [
        "###5.7.3. 获取数据和训练模型\n",
        "因为VGG-11计算上比AlexNet更加复杂，出于测试的目的我们构造一个通道数更小，或者说更窄的网络在Fashion-MNIST数据集上进行训练。"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "agZ-8QvNV_ab",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "ratio = 4\n",
        "small_conv_arch = [(pair[0], pair[1] // ratio) for pair in conv_arch]\n",
        "net = vgg(small_conv_arch)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Je8rIieIWTuR",
        "colab_type": "code",
        "outputId": "a9f7e1a2-13ca-4b2e-bbd7-e7f7eadfeece",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 306
        }
      },
      "source": [
        "lr,num_epochs,batch_size=0.05,5,128\n",
        "optimizer=optimizers.SGD(learning_rate=lr)\n",
        "loss=losses.SparseCategoricalCrossentropy(from_logits=True)\n",
        "buffer_size=1000\n",
        "def load_data_fashion_mnist(batch_size,buffer_size):\n",
        "  (x_train,y_train),(x_test,y_test)=keras.datasets.fashion_mnist.load_data()\n",
        "  x_train=x_train[:,:,:,np.newaxis]#将三维张量增加一个channel维\n",
        "  x_test=x_test[:,:,:,np.newaxis]\n",
        "  train_iter=Dataset.from_tensor_slices((x_train,y_train)).map(lambda x,y:(x/255,y)).shuffle(buffer_size).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
        "  test_iter=Dataset.from_tensor_slices((x_test,y_test)).map(lambda x,y:(x/255,y)).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
        "  return train_iter,test_iter\n",
        "train_iter,test_iter=load_data_fashion_mnist(batch_size=batch_size,buffer_size=buffer_size)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "WARNING:tensorflow:Entity <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b9ae8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to identify source code of lambda function <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b9ae8>. It was defined on this line: train_iter=Dataset.from_tensor_slices((x_train,y_train)).map(lambda x,y:(x/255,y)).shuffle(buffer_size).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
            ", which must contain a single lambda with matching signature. To avoid ambiguity, define each lambda in a separate expression.\n",
            "WARNING: Entity <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b9ae8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to identify source code of lambda function <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b9ae8>. It was defined on this line: train_iter=Dataset.from_tensor_slices((x_train,y_train)).map(lambda x,y:(x/255,y)).shuffle(buffer_size).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
            ", which must contain a single lambda with matching signature. To avoid ambiguity, define each lambda in a separate expression.\n",
            "WARNING:tensorflow:Entity <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b9f28> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to identify source code of lambda function <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b9f28>. It was defined on this line: train_iter=Dataset.from_tensor_slices((x_train,y_train)).map(lambda x,y:(x/255,y)).shuffle(buffer_size).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
            ", which must contain a single lambda with matching signature. To avoid ambiguity, define each lambda in a separate expression.\n",
            "WARNING: Entity <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b9f28> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to identify source code of lambda function <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b9f28>. It was defined on this line: train_iter=Dataset.from_tensor_slices((x_train,y_train)).map(lambda x,y:(x/255,y)).shuffle(buffer_size).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
            ", which must contain a single lambda with matching signature. To avoid ambiguity, define each lambda in a separate expression.\n",
            "WARNING:tensorflow:Entity <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b97b8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to identify source code of lambda function <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b97b8>. It was defined on this line: test_iter=Dataset.from_tensor_slices((x_test,y_test)).map(lambda x,y:(x/255,y)).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
            ", which must contain a single lambda with matching signature. To avoid ambiguity, define each lambda in a separate expression.\n",
            "WARNING: Entity <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b97b8> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to identify source code of lambda function <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b283b97b8>. It was defined on this line: test_iter=Dataset.from_tensor_slices((x_test,y_test)).map(lambda x,y:(x/255,y)).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
            ", which must contain a single lambda with matching signature. To avoid ambiguity, define each lambda in a separate expression.\n",
            "WARNING:tensorflow:Entity <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b27ad7950> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to identify source code of lambda function <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b27ad7950>. It was defined on this line: test_iter=Dataset.from_tensor_slices((x_test,y_test)).map(lambda x,y:(x/255,y)).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
            ", which must contain a single lambda with matching signature. To avoid ambiguity, define each lambda in a separate expression.\n",
            "WARNING: Entity <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b27ad7950> could not be transformed and will be executed as-is. Please report this to the AutoGraph team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output. Cause: Unable to identify source code of lambda function <function load_data_fashion_mnist.<locals>.<lambda> at 0x7f3b27ad7950>. It was defined on this line: test_iter=Dataset.from_tensor_slices((x_test,y_test)).map(lambda x,y:(x/255,y)).batch(batch_size).map(lambda x,y:(image.resize(images=x,size=(224,224)),y))\n",
            ", which must contain a single lambda with matching signature. To avoid ambiguity, define each lambda in a separate expression.\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "iUqtTsmnWy7L",
        "colab_type": "code",
        "outputId": "6b4a4709-727c-484b-f57a-637ded6ef8e5",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 50
        }
      },
      "source": [
        "net.compile(optimizer=optimizer,loss=loss,metrics=['accuracy'])\n",
        "history=net.fit_generator(train_iter,validation_data=test_iter,epochs=num_epochs)"
      ],
      "execution_count": 0,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Epoch 1/5\n",
            " 90/469 [====>.........................] - ETA: 1:31:23 - loss: 1.8360 - acc: 0.2180"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "ux0_zEIaXyIy",
        "colab_type": "text"
      },
      "source": [
        "5.7.4. 小结\n",
        "* VGG-11通过5个可以重复使用的卷积块来构造网络。根据每块里卷积层个数和输出通道数的不同可以定义出不同的VGG模型。"
      ]
    }
  ]
}